Can General-Purpose Compression Schemes Really Compress DNA Sequences?

Authors

  • Toshiko Matsumoto
  • Kunihiko Sadakane
  • Hiroshi Imai
  • Takumi Okazaki

Similar resources

Genbit Compress Tool (GBC): A Java-Based Tool to Compress DNA Sequences and Compute Compression Ratio (bits/base) of Genomes

We present a compression tool, “GenBit Compress”, for genetic sequences, based on our newly proposed “GenBit Compress Algorithm”. Our tool achieves the best compression ratios for entire genomes (DNA sequences). Significantly better compression results show that the GenBit Compress algorithm outperforms the other genome compression algorithms for non-repetitive DNA sequences in genomes. The ...

Biological sequence compression algorithms.

Today, more and more DNA sequences are becoming available. Information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. The s...

DNABIT Compress – Genome compression algorithm

Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress", for DNA sequences based on a novel algorithm of assigning binary bits...

GenomeCompress: A Novel Algorithm for DNA Compression

The genome of an organism contains all hereditary information encoded in DNA. It is therefore extremely important to sequence the genome, which determines how organisms survive, develop and multiply. Over the past three decades, thanks to massive DNA sequencing efforts, the complete genome sequences of a large number of organisms, including humans, have become known, and the genomic databases are growing exponentially...

Compression for Similarity Queries

Traditionally, data compression deals with the problem of concisely representing a data source, e.g. a sequence of letters, for the purpose of eventual reproduction (either exact or approximate). In this work we are interested in the case where the goal is to answer similarity queries about the compressed sequence, i.e. to identify whether or not the original sequence is similar to a given quer...

Publication date: 2000